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Is it better to use radiances or retrievals in data assimilation? I started out my career believing 



in radiance assimilation (Hoffman 1983; Hoffman and Nehrkorn 1989), but now, with radiance 



assimilation having become mainstream, I'm having second thoughts. Here I outline an approach 
following Rodgers (2000) that might have some advantages that uses retrievals along with the 
so-called Jacobian or sensitivity matrix, actual averaging kernel (AK), actual prior, and estimated 
retrieval noise covariance. (See §HJ) Others have considered using retrievals along with other 
information provided by the retrieval process ( poiner and da Silva|1998| ). Eyre and coworkers used 
ld-VAR retrievals within 3d-VAR and 4d-VAR assimilation systems (Eyre et al. 1993; Szyndel 
et al.|2004|[Pavelin et al.|2008| ). 

Using retrievals gives us several advantages (Rodge rs|2000[ Section 8.3). First it reduces com- 
plexity in the //-operator, reduces data volume, allows sequential processing within the retrieval 
algorithm, allows arbitrary cloud clearing methods, and makes the assimilation system more mod- 
ular. (Sequential processing means, for example, first use one band for temperature retrievals, then 
use some other band for humidity retrievals, etc.^ (More in §|2j) Using EOFs gives us additional 
advantages, including reduced data volume. (See § 8j) There is also the potential to reduce vertical 
interpolation errors that arise due to the different vertical grids used in the forward problem and 
the meteorological model. (See §[9j) Note that if the retrieval method uses an EOF representation, 
then formally the error covariance matrix in physical space is not full rank. (But see below in 
the discussion of "EOF covariances" where we describe how we create the physical space error 
covariance matrix at AER.) Finally the AK provides the opportunity to remove the influence of 
the prior, and allows for interactive retrievals (where, for example, the prior is the ensemble mean 
forecast). (See §|5j) 



^Contact information: Dr. Ross N. Hoffman, Atmospheric and Environmental Research, Inc., 131 Hartwell Av- 
enue, Lexington, MA 02421-3126 Email: ross.n.hoffman@aer.com. 

2 Why do retrieval people like to do this? I think because of uncertainty in the spectroscopy and in estimating errors, 
and because the optimal coupled problem is so sensitive. 
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The motivation for this approach is for instruments with thousands of channels. This includes 
AIRS, IASI and the Aura TES. In these cases, instead of using thousands of channels or having 
to decide which are the optimal channels to use in the DAS, these issues would be pushed back to 
the retrieval scheme. But this approach can be applied more generally, and has the advantage of 
explicitly treating correlations of errors for a single retrieved profile, with only modest impacts on 
the DAS. 

There are three successive developments in what follows. First, in §[4j we show how to use 
the AK and the retrieval noise covariance to transform the retrieved quantities into observations 
that are unbiased and have uncorrected errors, and to eliminate both the smoothing inherent in 
the retrieval process and the effect of the prior. Second, in §[8j we show how to transform this 
result into EOF space, when a truncated EOF series has been used in the retrieval process. This 
provides a degree of data compression and eliminates those transformed variables that have very 
small information content. In both approaches a vertical interpolation from the dynamical model 
coordinate to the radiative transfer coordinate is required. In the third development, in §[9| we 
propose using the EOFs to essentially reduce the error of the vertical interpolation. 



2. Radiance assimilation :: pros and cons 

The advantages of direct radiance assimilation are that we use the data in its basic state and there 
are weak or no error correlations. The disadvantages are that there may be huge numbers of 
channels, and that what I've termed geophysical biases can effectively induce observation error 
correlations. For example, the meteorological model might have a very simple representation of 
surface emissivity, while the true surface emissivity is governed by geologic parameters that have 
long length scales. This difference must be considered an observational error in data assimilation 
systems (DAS)j^A retrieval scheme might employ a slack variable (e.g., cloud fraction) or use 
information from some channels to model the emissivity in other channels. Further, spectroscopic 
inconsistencies may introduce correlations in the forward problem, which enter the problem in the 
same way as observational error correlations. In the end, I expect that whatever can be done in a 
retrieval scheme can probably be done in DAS if we are clever enough, but at a cost in complexity 
and maintenance. I believe it is best to avoid the use of radiances in the DAS and instead use 
retrievals with a strategy along the lines outlined here whenever an intricate retrieval schemes is 
favored by the remote sensing community to extract the maximum information content from the 
observations. 



3. Notation 



As we will quote so many results from Rodgers (2000), we will adopt his notation, defining terms 



as we go. However, this means that x discussed here, the vector of retrieved quantitie^J become 



3 Like "fish" [one fish, two fish . . . ] "DAS" is both singular and plural. 
4 Typically a temperature profile. 
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the observations, usually denoted y in an EnKF or 4d-VAR DAS context, while at the same time y 
in[Rodgers (2000) denotes the channel radiances. Moreover in what follows we propose that we 
use a transformation of x that involves the AK (A) as an observation in the DAS, so to bridge the 
notation gap, we will denote this transformation as yA- 



4. Averaging kernel method 



In theory (e.g., Rodgers 2000), a significant fraction of retrieval error results from the fact that the 
retrievals can not observe the detailed vertical structure of the atmosphere and prior information 
is used to fill some of the gaps, but the result is smoothed compared to reality. When used as a 
profile, this corresponds to correlated errors. Also, using the best possible prior (the forecast!) is 



considered a no-no by many. |Rodgers| ( |2000[ Section 8.3.4) suggests a one-sentence way around 
this. He leaves a few details for the student to fill in, but most of what we need is scattered 
throughout his excellent book. To begin, Rodgers (2000, Eq. 3.12) writes the retrieval as 



AX+(l-A)X a + Gy£y. 



(1) 



Here x is the retrieval, x is the true state vector, and x a is the prior. The matrix A is the averaging 
kernel. The total measurement error relative to the forward problem, e y ( [Rodgers|2000 Eq. 3.1 1), 
includes measurement error and forward model error and is usually characterized as being unbiased 
and having a specified covariance matri x, S e ^ I n Eq. ([!]) the matrix G y is the sensitivity of the 
retrieval to the radiances and is given by Rodgers (2000, Eqs. 3.25, 3.27) as 



dx 
dy 



SK T S~\ 



(2) 



Here K = ^ is the Jacobian of the forward problem, F, evaluated at the solution x (Rodgers 
Eq. 2.20), and 



2000 



K T S £ l K + S a 1 



(3) 



defines the accuracy (inverse of the error covariance) of the retrieval in terms of the accuracy of 



G y y and the accuracy of x a (Rodgers 20001 Eqs. 2.27, 4.13). An alternative computational formula 
for G y , given by|Rodgers|(|2000[ Eqs. 2A5) is 



with 



S a K T S q 1 



KS a K + iSj 



(4) 



(5) 



in which S q should be well conditioned due to the presence of S £ r\ With this notation we may write 



A = G y K = S{K l S £ L K) =I-SS, 



(6) 



5 In practice S £ is often taken to be diagonal, but that is not material in this analysis. We return to a discussion of 
S e i n §g} 



S ^GySq - 



f Rod gersl (12000 Section 4.1) demonstrates the algebraic equivalence of these two forms. In brief, find that 



K {I + S- l KS a K l ) for both forms of G y . 
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following [Rodgers| ( |2000[ Eqs. 2.78, 2.79)0 



Within a generalized DAS, where any observation can be used if it can be calculated, this 
formulation of the retrieval can be used to eliminate the impact of the prior. For this purpose we 
posit a pre-processing step that removes the prior and effectively removes smoothing as an error 
source (Rodger s]2000[ Section 8.3.4) by defining the new observation as 



y A = x - (I - A)x a , (7) 

and the new observation operator by 

y A =Ax. (8) 

The retrievals are of course still smooth, but we will now be comparing observed and simulated 
quantities with the same degree of smoothing. Now the observation increments 

h = Gy£y, (9) 

are unbiased if the e y are unbiased, and have covariance 

S m = G y S e Gl. (10) 



Rodgers]( |2000[ Eq. 3.19) calls this the retrieval noise covariance. An alternative form for Eq. ([TO 



is 

S m =A§, (11) 



which follows from substituting Eq. ([2]) into Eq. ( 10), and then making use of the second form of 
Eq. 



The end result is the same situation that is always obtained in maximum a posteriori (MAP) 
retrieval schemes — you compare an observation (i — (/ — A)x a ) to a simulated value (Ax) such that 

the mean difference has zero expectation and you normalize the difference by the square root of its 

_i 

inverse covariance (5 m 2 ). 

In this approach, the //-operator for this observation is simply Ax, but remember that x is 
defined in the space of the forward radiative transfer model. Therefore the observation operator will 
in general require time interpolation followed by interpolation in latitude and longitude, followed 
by vertical interpolation to the vertical grid of the forward problem and finally the computation of 
yA according to Eq. ([8]). Then, each element of the observation vector, y A (i), will be compared to 
the zth element of the simulated vector y A (i)- For convenience we might store the elements of the 
observation vector individually, and compute the elements of the simulated vector individually as 
the dot product of the ith row of A and x. Practically then, the individual elements of the observation 
vector, y A (i), might be stored along with the corresponding rows of A. The rows of A might also 
be used in determining vertical localization. Note that the preceeding arguments formally depend 
on the correctness of Eq. ([T]), including the implicit linearization assumption. 



7 To see the equivalence of the second two forms, note that / = SS~ l = S(K T S^ l K + S^ 1 ) according to Eq. 0, and 
subtract the second form to find that I — A = SS~ 1 . 

8 Although A is not symmetric, AS is symmetric since S m is symmetric. Explicitly (AS) T = S T A T = 
S(K T S- l K) T S = S(K T S- 1 K)S = AS. 
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If we scale by S m 2 we effectively rotate to a space where the observation errors are unbiased, 
uncorrelated, and have unit variance. Specifically we redefine the new observation as 

y A = S;J(x-(I-A)x a ), (12) 

and the new observation operator by 

_ i 

y A = S m 2 Ax = A R x, (13) 
where we have defined the "rotated" AK, 

A R = S~h, (14) 
for convenience. Now the observation increments 

_i 

$A~yA =S m 2 G y £ yi (15) 

are unbiased, and have covariance Irj 



5. Forecast as prior 



After the pre-processing of Eq. ( 12 ), x a is no longer needed. Since the observation increment does 
not depend on the prior, we can use any prior. One possibility is to use the ensemble background 
mean as x a and the ensemble sample covariance as S a . We would call this the interactive retrieval 
approach. This complicates the data flow since now the background ensemble interpolated to 
the observation time, location, and vertical structure must be provided to the retrieval scheme. 
However, this is the best possible prior for the retrieval within the context of data assimilation. 
Further, if the system has settled down, then the ensemble background mean should be a good 
estimate of the truth and the linearization assumption of Eq. ([T|) should be a better approximation 
than otherwise. 



6. Forward problem errors 

The correct specification of the statistics of the combined measurement and forward problem er- 
rors, S £ , is crucial to the method. Usually we think of S £ as something specified by the retrieval- 
performing organization. However, there are two different approaches to determining S £ , and the 
second one involves the data assimilation system. The first, what might be called a bottom-up ap- 
proach, tries to estimate all the sources of error separately and build up a composite estimate of S £ . 
The second, or top-down approach, is based on comparisons of radiances and simulated radiances, 
most conveniently within the context of a radiance data assimilation system. 

In the bottom-up approach the following sources of error should be considered: 

_ i _ i j _ i _ i j 

9 Since S m is symmetric, we have ((y A -yA)($A -Ja) 7 ) = S m 2 G y (e y eJ)Gy S m J = S m 2 S m S m J = I. 
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• Measurement error including calibration error, misrepresentation of the instrument response 
function, sources of stray radiation, geometry error, geolocation error, polarization charac- 
terization, . . . ; 

• Spectroscopy errors (in line widths, strengths, continuum, . . . ); 

• Neglect of NLTE effects; 

• Neglect or errors in how the magnetic field interacts with radiation, including Zeeman line- 
splitting, Faraday rotation, . . . ; 

• Integration errors, i.e., truncation errors from the vertical grid; 

• EOF truncation errors; 

• Cloud clearing errors and beam-filling effects; and 

• Errors in physical properties that must be specified, but are not part of the control vector. 

The last item might include aerosol, cloud, and surface properties as well as trace gas absorber 
amounts, such as CO2 amounts, and the cosmic background in the microwave. All of these effects 
could be estimated in simulation. Estimating cloud clearing errors is likely the most difficult. 



Among the top-down approaches, those derived from the method of Desroziers et al. (20051 



seem most promising. Desroziers et al. (2005) combine differences among the observations, the 



analysis, and the background, all in observation space to estimate the analysis, background, and 
observation error covariances — again, all in observation space. To use this approach to estimate the 
forward problem error would require the radiance observations and the background and analysis 
evaluated as radiances. For discussion here, let x b be the background for one of the ensemble 
members interpolated to the observation time and location, and to the vertical grid of the forward 
problem 10 Then we can estimate the corresponding background radiance y b as 



y b = y + K(x b -x) 



(16) 



where y = F(x) is the radiance associated with the retrieval. The ensemble mean of y b would 



be the appropriate background in observation space for the pesroziers et al.| ( |2005j ) method. The 
same approach applies to the analysis radiance, except to note that the analysis is normally only 
available at the analysis time. However, in the LETKF approach, the analysis weights can be 
applied at each time (say each hour) to define the analysis at the same times as the background. Or 
the analysis weights could be applied directly to the y b to define the analysis in observation space 
for the Desro ziers et al.| ( |2005| ) method. 

Exam ples of studies that a p ply the so-called "Desrozier s diagnostic" to radianc e observation 



errors are 



Garand et al. (2007); 



Bormann and Bauer||2010| ); |Bormann et al.l^l OlpI Within an 



10 This interpolation is part of our observation operator. 

11 The later two references also apply two other methods, the so-called Hollingsworth/Lonnberg method 



( |Hollingsworth and Lonn berg 1986) and a method based on subtracting a scaled version of mapped assumed back- 
ground errors from FG-departure covariances. 
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enemble Kalman filter, |Li et al.| ( |2009| ) demonstrate how it is possible to continuously update both 
the observation and background error statistics based on jDesroziers et aL ( |2005| ). 

For radiance data assimilation, the Desroziers et al. ( 2005| ) approach includes representative- 



ness errors in the radiance errors. If we don't want to include this error component, and Rodgers 



(2000) would not, then we can project the observed radiances onto the subspace sampled by the 
model. To do this we would need to assume that the forward model is correct to transfer model 
states to radiances, but then we would only use the leading EOFs of a diverse sample of model 
states in radiance space to filter the observations. 



7. EOF properties 

Definition. Empirical orthogonal functions (EOFs) can be defined in terms of any state vector. 
For this description, suppose the state vector x is the list of temperatures at the levels used in the 
forward problem. But x could also include other profile parameters and surface parameters with 
no loss of generality. The EOFs are the eigenvectors of the x sample-covariance matrix. The EOFs 
are ordered by eigenvalue. The variance explained by each EOF is given by its eigenvalue. Thus, 
the first EOF is the anomaly "pattern" that occurs most frequently. 

Data compression. Since an EOF representation is often used in the retrieval to filter small 
scales, the retrieval solution can be represented equivalently (and with no loss of information) as a 
vector of EOF coefficient. In the case of our AER retrievals we use a single set of EOFs globally 
to avoid generating artificial edges in maps of the retrieved fields. For Mars TES, 4-6 EOFs are 
sufficient. Then the retrievals, error covariances, and AKs can be provided to the DAS in terms 
of the leading EOFs. If we reduce the size of the data vector from n (say 30) to j (say 5) we also 
reduce the size of the AK and error covariance matrices from 0(n 2 ) to 0(j 2 ). 

EOF algebra. EOFs are used to relate anomalies, denoted x! , from an overall mean, x. That is, 

x' = x — x. (17) 

In any implementation we first subract x before converting to EOF coefficients and then add it back 
when converting to physical space. The EOFs are truncated and ordered as columns in a matrix E. 
In what follows it is only important that E T E = I. But note that if all EOFs are retained than E T 
and E are inverses so that EE T = I as well. 

The projection of x on the EOFs gives the EOF coefficients a according to 

E T x' = a. (18) 

Reconstruction and filtering (in the case where we have truncated the EOF series) is obtained by 

Ea = x' f . (19) 



If we left multiple Eq. ( 19 ) by E T , then, since E T E = I, we find E 1 x'^ = a. Therefore E T (x' — x'j) 
0, demonstrating that the part of x' that is filtered does not project onto the retained EOFs. 
In summary, we made the following definitions: 
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• x' is the state vector [K]. (Note that this is a deviation from an overall mean.) 

• E is the matrix of EOFs [1]. 

• a is the vector of EOF coefficients [K]. 
We established these relationships: 

• E T E = I is the orthonormality relationship [1]. 

• E T x' = a is the projection method [K]. 

• Ea = x 1 is the filter or reconstruction method [K]. 
EOF covariances. The sample covariance matrix of x is 

S=({x- (x))(x- (x)) T ) =E({a-(a))-(a- {a)) T )E T = ESE T (20) 

since x — (x) = (x — x) — ((x) — x) = Ea — E(a). Here angle brackets denote an average over 
the sample. If the sample is the posteriori distribution then (x) = x, if the sample is the prior 
distribution then (x) = x a , and if the sample is the climate distribution then (x) = x and (a) = 0. 
If E includes all eigenvectors, then S = E T SE. Generally we truncate the EOF series so E is not 
full rank and as a result, if the retrieval is done in terms of EOF coefficients, then S will not be 
full rank either. However, we can still apply S = E T SE in this case to obtain the upper left corner 
of the full S matrix as desired f^j While Eq. (20) can be used to transform from EOF to physical 



space covariance it is better to add in the "noise" present in the truncated EOFs. For example, 



in the AER retrieval we determine the error covariance matrix of the EOF coefficients cj Since 
the coefficients of the truncated EOFs are not estimated during the retrieval we can add back the 
variability of these EOFs given by the associated eigenvalues. Computationally, take the diagonal 
matrix of eigenvalues, replace the upper left block with the estimated error covariance of the EOF 
coefficients from the retrieval scheme and apply Eq. ( [20] ) using the full (not truncated) matrix E. 



8. EOF analysis of the AK method 

Note that Eq. ([T]) is usually written for the full physical space quantities, but we can subtract 
x = Ax+ (I — A)x from Eq. ([I]) to get a version in terms of anomalies, 

x' =Ax' + (I-A)x' a + G y £ y . (21) 



We now project Eq. (21) into EOF coefficient space. To do this we filter each x! -vector. For 
justification, note that the filtering step is normally done within the retrieval algorithm initially and 
at each step of the iteration. Since putting too much structure in the background for the retrieval 



12 Partition E into retained and truncated EOFs [E r E t ] to see this. 
13 This is a full symmetric matrix. 
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can degrade the results, it is also helpful to filter the prior. Further, within the context of a retrieval 
in terms of EOF coefficients, that part of the true xf that does not project on the EOFs should be 
considered part of the representativeness error. 

To filter each /-vector we simply replace each x' by its filtered version EE T x' . Then, to project 
into the EOF space, we left multiply each equation by E T . In the case of Eq. (21 ) we obtain 

(E T E)(E T x') = (E T AE)(E T x f ) + [E T (I -A)E](E T x' a ) + E T G y £ y , (22) 

where we have collected terms in parentheses to indicate how to apply the above relationships. If 
we define 

A = E T AE (23) 

then we have 

a=Aa + (I-A)a a + £ a , (24) 

where 



£ a — E T Gy£y, (25) 



&=E 1 (x-x), (26) 
and 

a a = E T (x a -x). (27) 

We may say that A is the projection of A into the EOF coefficient space, or simply that A is the EOF 
AK, and that £ a is the projection of the retrieval error into the EOF coefficient space, or simply 
that £ a is the EOF retrieval error. 
Now £ a has covariance given by 

S m — E Gy(£y£y)Gy E = E T G y S £ Gy E = E T S m E. (28) 



As we did before (in 



4), we scale by S m 2 to rotate to a space where the observation errors are 



unbiased, uncorrelated, and have unit variance. Specifically we redefine the new observation as 

y A =S~i(6c-(I-A)a a ), (29) 
and the new observation operator by 

y A = Snha = S^AE T x' =A R (x-x), (30) 
where we have defined the "rotated" EOF-space AK, 

A r = S7JaE t , (31) 
for convenience. Now the observation increments 

i 

$A~yA =S m 2 £ a , (32) 
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are unbiased, and have covariance I. 

Note that we include the EOF projection operator in our definition of Ar in Eq. (31 ). This 
allows for the possibility of vertical localization and makes the implementation in EOF or physical 
space similar. Alternatively, one could determine a by projection (as in Eq. ([26]) and Eq. (|27|)), 



store rows of S m 2 A with the observation y& and then define the observation operator by first form 
in Eq. (30). This approach saves storage since it stores weights for EOF coefficients, not tem- 
peratures, but is not amenable to localization. Of course vertical localization may not be possible 
depending on the structure of Ar. 



9. Using the EOF representation for vertical interpolation 

Vertical interpolation inevitably introduces some errors. If we can somehow project the meteoro- 
logical model vertical structure directly onto the EOFs we could eliminate this source of error. In 

I I i _ 

this case we would use the first form of Eq. (30), S m 2 Aa. As described above we must interpolate 



the meteorological model values to the levels used in the retrieval and then project using Eq. ( 18 ). 
However there are some complications that make this unworkable when the grid definitions are 
very different, e.g., altitude relative to the geoid vs. sigma. What follows is an alternative approach 
(that, as a by-product, provides a method of vertical interpolation). 

In general we want to convert a vertical profile of model temperatures into the EOF coefficients 
defined relative to the retrieval vertical coordinate. For example, in the model, temperature might 
be stored as a-layer values, while the retrieval might operate on a fixed p-level grid. In this 
example, at some locations some of the p-levels might be below model topography. We require 
(and this is expected to always be the case) that there are more model temperatures in the profile 
than there are EOF coefficients. To convert from model temperatures, T, to EOF coefficients, a, 
we determine a to fit the T in a least squares sense. That is we minimize 

i = £(r i *-7}) 2 , (33) 

i 

where T\ are the background model temperatures and T* are the model temperatures reconstructed 
from the a. Here we restrict the 7} to just those model temperatures that can be reconstructed from 
a specification of the EOF coefficients. For example, any model temperatures above the uppermost 



temperature in the retrieval vertical coordinate would not be included in the sum in Eq. (33). 
The reconstructed temperatures are determined from a in two steps. (We don't actually perform 
these steps, but we must define them to solve the least squares problem.) First we reconstruct the 
temperature profile x in the retrieval vertical coordinate, 

x = Ea + x. (34) 

Second we interpolate to the model vertical coordinate using an interpolation operator V that is 
linear in the x, 

T* = Vx = V(Ea+x). (35) 
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Usually we will interpolate the temperatures linearly in pressure. Note that in the example de- 
scribed above, V will depend on the model surface pressure and vary from location to location. 
Instead of doing an actual interpolation, we must be able to determine the matrix V. To minimize 
J with respect to the a we set the derivatives equal to zero and find thaQ 

E T V T VEa = E T V T (T-Vx). (36) 

This is a linear equation of the form Ma = b where the size of M might be 6 x 6, but M will 



vary from location to location. Note in Eq. (36) that it is only the (presumably smooth) x that is 
vertically interpolated. 

Given the a determined this way we can then calculate x according to Eq. (|34|)p^1 This would 
fit into either the AK method or the EOF-AK method. 



10. Implementation details 

a. Underground when using pressure as the vertical coordinate 

If the radiative transfer model (RTM) for F uses a grid fixed in pressure (or in altitude) then 
the lengths of the vectors and the shapes of the matrices will vary from retrieval to retrieval as 
topography and surface pressure vary from location to location. We may simply truncate those 
parts of the vectors and matrices that are underground. Operationally, we may define underground 
as those levels whose corresponding rows of K = 0, i.e., levels that have no effect on the calculated 
radiances. Alternatively, we can keep all the vectors and matrices full size but take care that 

underground levels are handled properly. For example, underground rows of K must be zero or 

_i 

must be strictly ignored. In particular, underground elements of S m , S m 2 , and S~ l should be reset 
to zero after they are calculated to avoid the possible effects of accumulating round-off errors. 

When using EOFs, underground levels in A, G y or S m , and each x' should be zero. Equivalently 
the variables in Eq. ( |24~| ) will be assured to be correct by setting the rows of E below the surface to 
zero. 



b. Generalized matrix inversion 

Since S a may be poorly conditioned it will generally be necessary to use a truncated eigenvector 
decomposition to find the (Moore-Penrose) inverse of S a . This is not necessary if S is provided 
along with Jc. Alternatively, if we are given S £ , S a , and K, it is possible to sidestep S entirely by 

using Eq. (4) and Eq. Similarly to compute 5 m 2 from S m it may be necessary to use a truncated 
eigenvector decomposition. 

14 Using Einstein notation (summation implied for repeated indices), the reconstructed temperature at level i is 
written as T* = Vn ( (Ei ( „a n +xt). The derivative of this with respect to ccj may be written as EZV^. Then setting the 



derivative of J to zero gives = 2Ej k V^ (T* — 7}). Substituting for T* here and rearranging terms then gives Eq. ( 36 1. 

15 Note that this produces temperature values for all radiative transfer model grid points, even below the surface in 
some cases. Any values below the surface will not be used by the radiative transfer model, but by construction these 
will be statistically consistent with the rest of the profile. 
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The idea here is that after we rotate to the eigenvector coordinate system there are some coor- 
dinates that are so fiat (have such small eigenvalues) that they can be ignored. So we determine 
an inverse or square root inverse that is formally correct in the full but transformed space and then 
ignore (truncate) the near-singular dimensions. The inverse matrix in the truncated transformed 
coordinate system is in fact a proper Moore-Penrose pseudoinverse matrix in the original coordi- 
nate system. The correct choice of truncation would be the truncation used in the retrieval scheme. 
If an EOF truncation is not used in the retrieval scheme then the correct choice of truncation may 
be apparent from the eigenvalue spectrum. Otherwise it will probably be sufficient to choose the 
truncation such that at least 99.9% of the variance is retained, i.e., such that the sum of the retained 
eigenvalues is > 99.9% of the sum of all the eigenvalues. 

Eigen-decomposition. Any positive definite matrix 5, such as a covariance matrix, may be 
written as 

S = VGV T , (37) 

where V is a matrix of the eigenvectors and G is a diagonal matrix of the eignenvalues of 5. 
Inverse. The pseudoinverse of 5 is given by 

5+ = VG~ 1 V r (38) 

since 

SS + = VGV T VG l V T = VV T = 1. (39) 

Note that V T V = I regardless of the truncation, but VV T = I is true only for the case of no trunca- 
tion. When we truncate both V and G we can easily demonstrate that 5 + satisfies the four required 
Moore-Penrose properties— 55+5 = 5, S + SS + = 5+, (SS + ) T = 55+, and (S+S) T = 5+50 
Square root. The square root of 5 is given by 

52 = VGW T (40) 



since 

S2S2 = VGW 1 VGW 1 = VGV 1 = 5 (41) 



1 --' l -rTxr/~-h-irT iz^rT 



for any truncation. According to Eq. (38 ) the pseudoinverse of 52 is then 



1 i It 
'— iT/' 



S2 + = VG -IV 1 . (42) 
c. Interface with retrieval and order of computation 

Consider how data flows through our procedures for a moment. Given the retrieval, x, once we 
know A and S m , we then calculate y^, and Ar from Eq. ( 12), and Eq. ([14]). Usually, the retrieval 



will not provide A and S m . Three possible pathways for the calculation of A and S m are described 



16 From Eq. |39| we have that SS + = VV T which is symmetric, and reversing the roles of S and S + in Eq. (39 1 gives 
the same result. Thus SS + and S + S are symmetric and equal to VV T . Clearly pre- or post-multiplying S or S + by VV T 
leaves them unchanged, demonstrating the first and second properties. And because of symmetry the third and fourth 
properties hold. 
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Pathway 


1 2 3 


Interface 

Retrieval inputs 
Retrieval outputs 


x^K x^S x^S^A 


Calculations 

s q = 

Gy = 
A = 


KS a K T +S £ [Eq. 
S a K T S q 1 [Eq.ff 

G y K [EqM 
GySgGy [Eq. 10 


P 

1 - - 
I-SS a l [Eq.§ - 
] AS [Eq. 11| AS [Eq. 1 1 1 



Table 1 : Pathways to calculate A and S m . See text for discussion. 



now. (Refer to Table [T]) Inputs to the retrieval include x a , S a , and S £ . In some situations, these 
three inputs are all constant. The three sets of outputs from the retrieval are x, and (i) K, or (ii) S, 
or (iii) S and A. In the first pathway, from the retrieval inputs S a and S £ and the specified retrieval 
output K we can calculate in turn S q , G y , A, and S m from Eq. ([5]), Eq. (|4]), the first form of Eq. ([6]), 
and Eq. ( 10). This pathway requires only the inversion of S q , which should be well conditioned 
as mentioned before. In the second pathway, using the retrieval input S a , and the retrieval output 
S we can calculate A from the third form of Eq. (|6j) and then S m from Eq. (11). In this case we 
need to invert S a . As pointed out by Deeter et al.| ( |2003" ) in cases where x a and S a are fixed, this 
second pathway greatly reduces the data that must be provided by the retrieval and S~ 1 needs to be 
calculated only once. In the third pathway using the retrieval outputs S and A we can immediately 
calculate S m from Eq. (11). Note that in the second and third pathways the retrieval outputs are all 
sized by the number of retrieved quantities, and not the number of radiances. 

When using EOFs, the above discussion still holds, but now we also need x and E to define the 
EOF representation. Then once we have determined A and S m as before, we can then calculate in 
turn A, a, a a , S m , y A , andA R from Eq. J23}, Eq. (|26[, Eq. &7\, Eq. ([28]), Eq. §29^, and Eq. 



11. Algorithm implementation 

Here I sketch out one implementation for interactive retrievals using the AK-EOF method and the 
EOF vertical interpolation operator. 

1. Assemble known constants: instrument description (channels, radiance errors (Sgp^j), geom- 
etry), EOF information (EOFs (E), climate mean (x), truncation, eigenvalues), 

2. Build observation file with times, locations, radiances (y), geometry, 

17 This should be instrument noise plus forward problem errors 
instrument noise is often used. 



Rodgers 2000 Eq. 3.11). However in practice just 
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3. Use the background ensemble to calculate the prior mean and covariance, x a and S a , and add 
these to the observation file. Note that x a and x are different in this case 



4. Perform retrieval. Add x and S in the retrieval output file 

5. Post-process retrieval. Following pathway 2, calculate in turn A, S m , S m , A, a, a a , $a (from 
Eq. (29)), and Ar. Save $a and Ar in (LETKF) obs-data structure. The zth element of $a is 
one observation. It is associated with the zth row of Ar. 

6. Data selection. For vertical data localization use the absolute value of the zth row of Ar as 
weights in the same way that we now localize radiances. 

7. Vertical interpolation. After interpolating the model grids to the map location and time of the 
observation, use the procedure of §|9]to determine a and then x using Eq. (36) and Eq. (34). 



8. Simulate observation. Calculate according to Eq. (30) 
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The case outlined above is the most complicated one. Other implementations of the methods 
described here could follow this plan with appropriate simplifications. 
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